
On the Design of Visual Behaviors for Autonomous Systems 



Jose Santos-Victor, Alexandre Bernardino, Cesar Silva

Instituto Superior Tecnico/Instituto de Sistemas e Robotica
IST - Torre Norte
Av. Rovisco Pais, 1
1096 LISBOA CODEX, Portugal
e-mail: {jasv,ajmb,cesar}@isr.ist.utl.pt



Abstract - We describe a set of visual behaviors
developed over the years, in the general context of
robot navigation. The first set of behaviors solves
three basic problems in vehicle navigation: obstacle
avoidance, docking to a surface and moving along
corridors or walls. Another set of visual behaviors
was developed for the active stereo head Medusa,
with the purpose of tracking objects with arbitrary
shape or motion. Finally, we present an approach for
egomotion estimation assuming arbitrary motion for
the camera. All behaviors use purposive visual inputs,
based on similar image measurements (image flow)
but computed on different regions of the visual
field. Goals are attained without camera calibration
or environment reconstruction.

I. INTRODUCTION 

Many animals perform numerous and repetitive tasks 
without the apparent consciousness of the actions in- 
volved. The mental processes involved in directing hu- 
man locomotion look, at least in adult humans, surpris- 
ingly simple and do not seem to require a conscious 
activation of different processes or the explicit compu- 
tation of environmental measures. All these processes 
seem to be happening "automatically" and they do in 
fact occur in parallel with other mental processes. In
principle they cannot be considered "reflexes" (at least
in the sense physiologists define them) because the same
sensory input can cause rather different motor responses 
according to what the person is doing and the motor re- 
sponse can be voluntarily suppressed. However it is fair 
to say that the above mentioned processes seem to be 
running in our brain without a constant conscious su- 
pervision. In humans these behaviors are certainly the 
result of a developmental process during which a map- 
ping between sensing and acting is built, giving rise, for 
example, to a sensory-motor representation of locomotion.
Once this representation is learned, usually with
trial-and-error procedures, the emerging behaviors be- 
come part of the daily routine. 

What is largely learned in humans often comes as 
a reflex in so-called lower level animals which cannot 
afford a long period of learning before becoming
"autonomous". Some of the approaches presented in this
paper have, in fact, been inspired by insect behavior
and are aimed at building a library of embedded
visually-guided behaviors coping with the most common
situations encountered during autonomous navigation.



From the perceptual point of view, the approach has 
its roots in the paradigm of active perception [3, 4, 2] in 
the sense that the behaviors are based on active
exploration of the environment and take advantage of both
the structure of the environment and the purpose of the
task to be solved [1].

One of the most fundamental assumptions used 
throughout the paper is the continuous use of visual 
information during the action [6, 10]. The second is 
the possibility of designing a set of closed-loop visually 
guided behaviors whose goal is solely that of controlling 
direction of heading and forward velocity of the navi- 
gating actor (the goal of perception is to act), or con- 
trolling the degrees of freedom of a binocular head. The 
solutions presented do not, of course, solve the problem 
of system autonomy. Our goal, on the contrary, has been
to implement a set of behaviors which can economically 
solve a set of low-level (yet complex) problems, freeing 
the overall system (yet to be implemented) from some of 
the routine tasks encountered in autonomous systems. 

II. APPROACH 

A robot moving safely in an indoor environment requires,
as basic skills, the ability to detect and avoid obstacles,
to navigate along walls and corridors, to enter doors and
approach surfaces with a given orientation (e.g. when
entering an elevator or docking in a given place to
collect or deliver materials), to determine its own motion
parameters and to track moving objects in the scene. The
behaviors described here, even if not integrated yet, do 
in fact cover almost all of these requirements and they 
do it by using similar visual measures (which is good 
from the economical point of view) purposively used to 
control the motion variables specific to each behavior. 

An important aspect is the way we use the optical 
flow to pursue our goals. Estimating the full optical flow 
field is an underconstrained problem due to the aper- 
ture problem, since with local image measurements we 
can only determine the flow component along the image 
gradient direction - the normal flow. In our approach, 
we exploit the particular nature of each behavior to con- 
strain the required optical flow computations. 
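The normal flow mentioned above can be illustrated with a minimal sketch. It assumes the standard brightness constancy constraint I_x u + I_y v + I_t = 0, which the text does not state explicitly; the function name `normal_flow` and the simple finite-difference derivatives are illustrative choices, not the implementation used in the experiments.

```python
import numpy as np

def normal_flow(frame0, frame1, eps=1e-6):
    """Return (magnitude, nx, ny) of the normal flow between two gray frames.

    Assumes brightness constancy: ix*u + iy*v + it = 0, so only the flow
    component along the unit gradient direction (nx, ny) is observable.
    """
    f0 = np.asarray(frame0, dtype=float)
    f1 = np.asarray(frame1, dtype=float)
    # Spatial derivatives on the mean frame; np.gradient returns d/drow, d/dcol.
    iy, ix = np.gradient(0.5 * (f0 + f1))
    it = f1 - f0  # temporal derivative by frame differencing
    grad = np.hypot(ix, iy)
    valid = grad > eps  # no gradient means no measurable normal flow
    safe = np.where(valid, grad, 1.0)
    mag = np.where(valid, -it / safe, 0.0)
    nx = np.where(valid, ix / safe, 0.0)
    ny = np.where(valid, iy / safe, 0.0)
    return mag, nx, ny
```

For a horizontal intensity ramp shifted by one pixel to the right, this sketch returns a unit normal flow along the x direction at every pixel.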

In Section III, we present the various behaviors
related to navigation. The centering reflex, observed for
freely flying honeybees, is used to drive a robot along
corridor-like or wall-like environments, while controlling
the forward speed and avoiding lateral obstacles.

Then we describe an obstacle avoidance behavior
based on inverse perspective mapping and a set of
docking behaviors to drive the robot to a certain point
in the scene, aligning itself perpendicularly to the
docking surface and controlling the velocity.

IEEE Catalog Number: 97TH8280 - 553 - ISIE'97 - Guimaraes, Portugal

In Section IV, we describe an approach for binocular 
tracking based on log-polar images that combines ver- 
gence and pursuit behaviors. Binocular disparity and 
motion cues are used to control the 4 degrees of free- 
dom of our robot head. 

Finally, in Section V we illustrate how the normal flow
can be used to estimate the robot ego-motion, which
can be integrated in a closed loop control strategy. We
rely exclusively on the information of the normal flow
to recover the robot linear and angular velocity.

Therefore, by considering the specificities of each of
the behaviors, we can tailor the visual processing
required to achieve the control goals, and thus
overcome some of the limitations related to the aperture
problem. Additionally, different attentional areas are
used for the various systems (e.g. foveal versus peripheral
vision) or, in some cases, attention is directly embedded
in the sensor geometry.



III. NAVIGATION 

We have designed a set of visual behaviors to solve some
relevant problems in autonomous navigation: navigating
along corridors or walls, obstacle avoidance and docking 
to a surface in the environment. 

III.1. CENTERING BEHAVIOR

The first visually guided behavior is the centering reflex,
described in [19] to explain the behavior of honeybees
flying within two parallel "walls". The qualitative visual
measure used is the difference between the image
velocities computed over a lateral portion of the left and the
right visual fields, as described in [11].

One of the major driving hypotheses is the use of
qualitative depth measurements. Additionally, the goal of
the visuo-motor controller is limited to the "reflexive"
level of a navigation architecture acting at short range.
In spite of these limitations, we will demonstrate a
variety of navigation capabilities which are not restricted
to obstacle avoidance or to the "centering reflex". For
example, it has been shown that it is possible to overcome
problems caused by the absence of flow information,
due to absence of texture or to localized changes in
environmental structure (e.g. an open door along a
corridor). An important remark is that the system does not
critically rely on the accuracy of the optical flow
computation [...].

Real-time Control 

The overall structure of the robot control system
involves two main loops. The Navigation loop controls the
robot rotation speed in order to balance the left and right
optical flows, hence maintaining the robot at similar
distances from structures on the right or left sides. The
difference between the left and right flow vectors (along
the horizontal direction) is used to control the robot
rotation. The Velocity loop controls the robot forward
speed by accelerating/decelerating as a function of the
amplitude of the lateral flow fields. The robot
accelerates if the lateral flow is small (meaning that the
walls are far away), and slows down whenever the flow
becomes larger (meaning that it is navigating in a narrow
environment). The mean flow vector on each side of the
robot gives a qualitative measurement of depth.
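The two loops described above can be sketched as a pair of proportional rules. This is only an illustration under assumptions of ours: the gains, the reference flow `flow_ref`, the speed limits and the sign convention (a positive turn rate steers to the right) are hypothetical and are not taken from the experiments.

```python
def centering_commands(flow_left, flow_right, v_prev=0.1,
                       flow_ref=2.0, k_turn=0.5, k_speed=0.2,
                       v_min=0.0, v_max=0.3):
    """Return (turn_rate, forward_speed) from mean lateral flow magnitudes.

    All gains and limits are hypothetical placeholders.
    """
    # Navigation loop: a larger flow on the left means the left wall is
    # closer, so steer to the right (positive turn rate by assumption).
    turn_rate = k_turn * (flow_left - flow_right)
    # Velocity loop: speed up when the mean lateral flow is below the
    # reference (walls far away), slow down when it is above (narrow space).
    mean_flow = 0.5 * (flow_left + flow_right)
    v = v_prev + k_speed * (flow_ref - mean_flow)
    v = min(v_max, max(v_min, v))
    return turn_rate, v
```

With balanced flows the turn rate is zero, and the forward speed only changes with the mean flow amplitude.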

Additionally, a sustaining mechanism is implicitly
embodied in the control loops to avoid erratic behaviors
of the robot, as a consequence of localized (in space and
time) absence of flow information. It allows the use of
the robot in environments far more complex than
corridors, [...] the "walls" are not [...].

To overcome these problems, we have introduced in
the control strategy a mechanism able to cope with
unilateral lack of flow information. Whenever it occurs, the
control system uses a reference flow that should be
sustained on the "seeing" camera (i.e. the camera still
measuring reliable flow). Hence, the robot will follow the
ipsilateral wall at a fixed distance. This simple qualitative
mechanism extends the performance of the system to a
much wider range of environmental situations.
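A minimal sketch of the sustained behavior follows, assuming that an unreliable flow measurement is signalled by `None` and that a positive turn rate steers to the right. The reference value, the gain and the choice of keeping the heading when both sides are blind are hypothetical.

```python
def turn_rate_with_sustain(flow_left, flow_right, flow_ref=2.0, k_turn=0.5):
    """Turn rate from lateral flows; None marks a side without reliable flow."""
    if flow_left is None and flow_right is None:
        return 0.0  # no information at all: keep the heading (assumption)
    if flow_left is None:
        # Only the right camera "sees": hold its flow at the reference,
        # turning left (negative) when the right wall gets too close.
        return k_turn * (flow_ref - flow_right)
    if flow_right is None:
        # Only the left camera "sees": turn right when the left wall is close.
        return k_turn * (flow_left - flow_ref)
    # Both sides reliable: ordinary centering on the flow difference.
    return k_turn * (flow_left - flow_right)
```

When the flow on the seeing side equals the reference, the command is zero and the robot keeps a fixed distance to that wall.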

Results 

In all the tests, the robot trajectory was recorded from 
odometric data during real-time experiments. Figure 1
shows the robot trajectories superimposed on the
experimental setup, for various types of environment.










































To test the velocity control, we considered the funnelled
corridor example, with varying width. As the corridor
becomes narrower, the average flow increases and
the velocity control mode forces the robot to slow down.
Using the velocity control, the final [...] of the [...]
the robot to make a softer, safer maneuver.






With the introduction of the sustained behavior, the 
robot is able to navigate in a much wider set of environ- 
ments. In fact, only one textured wall is needed for the 
navigation strategy to work. 

III.2. OBSTACLE DETECTION 

The navigation system described before is unable to de- 
tect obstacles located ahead of the robot. Here, we de- 
scribe a system for obstacle detection which uses a sim- 
ilar input to avoid obstacles. It exploits the geometric 
properties of the vehicle-camera-scene arrangement, and
does not depend on the knowledge of the camera param- 
eters or vehicle motion, as described in detail in [12]. 

The basic assumption is that the robot is moving on a
ground floor (as was the case of the centering behavior
presented before) and any object not lying on this plane
is considered to be an obstacle. The method is based on
the inverse projection of the flow vector field onto the
ground plane, where the analysis of the flow pattern is
much simplified.

As opposed to other systems, an important feature 
is that the knowledge of the vehicle motion is not re- 
quired and under certain circumstances, the approach 
is independent of the camera intrinsic parameters. 

Inverse Perspective Mapping 

The main idea is the analysis of the particular structure
of the optical flow field when a robot with a camera is
moving over a ground plane. In Figure 2, Pc is the image
plane of a camera moving forward with pure translation,
while facing a planar pavement. Even with this simple
arrangement, the resulting flow pattern is complex. This
is due to the perspective effects, and makes the problems
of motion analysis or obstacle detection difficult.

Fig. 2.: Inverse perspective mapping. The coordinate systems
(C) and (H) share the same origin even though in the
picture they have been drawn separately for simplification.
While in (C) the motion of the ground floor is perceived as
a complex vectorial pattern, in (H) all the vectors have equal
length and orientation under translational motion.

The idea is to inverse project the flow captured on
the image plane Pc onto the horizontal plane Ph, as
suggested in [8]. In this coordinate plane, the flow
pattern becomes much simpler, as the distance to any point
on the pavement is constant. All the vectors are equal,
and points lying above or below the ground plane can
be easily detected as obstacles.

The method operates in two steps. Initially, the robot
moves with pure translation and the projective
transformation between the image and ground planes is
estimated, without the need to calibrate the camera.
During normal operation, the normal flow field is inverse
projected onto the horizontal plane, where obstacles are
easily detected.

The planarity assumption is used to approximate the
flow field of the pavement by an affine mapping. The
affine flow parameters are then used to estimate the
projective transformation. The most salient features are
the exclusive use of normal flow information and the fact
that knowledge about the vehicle velocity or camera
parameters is not required. Apart from the initialization
phase, the system can deal both with rotational and
translational motion of the robot, and the method is
appropriate for use in real time.

Results

The method was tested extensively using synthetic and
real image data, and in real time on a mobile robot. The
camera was placed in the front part of the robot facing
the ground plane with an angle of about 65 degrees, with
no calibration. The robot speed was set to 10 cm/s. The
running frequency of the vision loop is around 1 Hz.

Figure 3 shows an example of the normal flow field
measured during the robot motion, with an obstacle in
the field of view. The rightmost image of Figure 3 shows
the image regions where points lying outside the ground
plane have been detected. When the robot is undergoing
pure linear motion, we have simply projected the
inverse mapped flow in the y direction and subtracted the
median flow. As the robot we used has a single forward
component of the linear velocity, we have also estimated
the rotation component and canceled this term from the
overall inverse projected flow and then, the same
detection process was applied. In both cases, similar results
were obtained (see [12] for details).




























































Fig. 3.: Left: sample of the ground plane normal flow field, 
during the robot motion. Center: resulting inverse projected 
flow. Right: detected obstacle. 



The resolution of the overall system determines the
minimum obstacle size that can be detected, and
strongly depends on the image resolution. If more
computational power is available, the image resolution can
be increased, which in turn improves the resolution of the
overall system.
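The detection step of this section (inverse projection of the flow, projection on the y direction and subtraction of the median flow) can be sketched as follows. The sketch makes simplifying assumptions of ours: it takes full flow vectors rather than normal flow, it assumes the image-to-ground projective transformation `H` is already available as a 3x3 matrix, and the function names and threshold are hypothetical.

```python
import numpy as np

def warp_points(H, pts):
    """Apply a 3x3 projective transformation H to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def detect_obstacles(H, pts, flow, threshold=0.5):
    """Flag image points whose inverse-projected flow departs from the median.

    pts, flow: (N, 2) arrays of image positions and flow vectors.
    """
    # Inverse-project both ends of each flow vector onto the ground plane.
    ground_flow = warp_points(H, pts + flow) - warp_points(H, pts)
    # Under pure translation all ground points share the same flow, so a
    # large deviation of the y component from the median marks off-plane points.
    residual = ground_flow[:, 1] - np.median(ground_flow[:, 1])
    return np.abs(residual) > threshold
```

The median is used here, as in the text, because it is not disturbed by a minority of points lying outside the ground plane.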

III.3. DOCKING 

In this section, we introduce vision-based docking
strategies for indoor mobile robots, where the robot 
should approach a surface, along the surface normal 
with controlled forward speed, until it finally stops. 

Two distinct situations for the docking problem are 
considered. In the ego-docking, the camera is mounted
on board the vehicle, and the robot egomotion is con- 
trolled during a docking maneuver to a particular sur- 
face in the environment. In the second scenario, that 
we call eco-docking, the camera and computational re- 
sources are installed on a single external docking station 
with the ability to serve multiple robots. Both scenarios 
are illustrated in Figure 4. 






Fig. 4.: Left: in the ego-docking, a robot, equipped with a
camera and computing resources, docks to a surface. Right:
in the eco-docking, the camera is attached to a single docking
station which may serve multiple robots.

From the perceptual point of view, both situations are 
quite similar since the important issue is the relative 
motion between the camera and the docking surface.
However, in the ego-docking, the camera position with
respect to the robot is fixed, whereas in the eco-docking 
it is changing continuously, thus posing new problems 
for the visuo-motor control loop. By proper formulation 
of the problem [13], exactly the same control architec- 
ture can be used in both cases. 

The objective of the control system, both in the
ego-docking and the eco-docking problems, is twofold. The
Heading control system aligns the camera axis and the
docking surface normal during the docking maneuver, so
that the robot approaches the surface perpendicularly.
Then, we use the Time to crash information to control
the robot forward speed, slowing the robot down when
approaching a wall.

As before, this behavior uses the planar surface
assumption to approximate the flow by an affine mapping
[13]. The affine optical flow parameters are estimated
from spatio-temporal image derivatives, and used
directly to control the robot motion. One of these
parameters is inversely proportional to the time to
collision, and should be kept constant during the maneuver,
slowing the robot when it approaches the goal. A second
parameter vanishes when the proper orientation is
reached, and can be regulated to zero to control the
robot heading. The relation between the affine
parameters and the robot linear and angular velocities is
similar both for the ego-docking and eco-docking. Hence,
apart from a sign inversion in the rotation controller, the
same controller can be used for both situations.
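As an illustration of how affine flow parameters can be obtained from spatio-temporal derivatives, the sketch below fits u = a0 + a1 x + a2 y, v = b0 + b1 x + b2 y by least squares on the constraint I_x u + I_y v + I_t = 0. The parameterization, the use of the flow divergence a1 + b2 as the quantity inversely proportional to the time to collision (exact only for a frontal approach to a fronto-parallel plane), and the gains are assumptions of ours; the parameters actually used by the docking controller are described in [13].

```python
import numpy as np

def fit_affine_flow(x, y, ix, iy, it):
    """Least-squares affine flow from image derivatives sampled at (x, y).

    Returns (a0, a1, a2, b0, b1, b2) with u = a0 + a1*x + a2*y and
    v = b0 + b1*x + b2*y, using the constraint ix*u + iy*v + it = 0.
    """
    A = np.column_stack([ix, ix * x, ix * y, iy, iy * x, iy * y])
    params, *_ = np.linalg.lstsq(A, -it, rcond=None)
    return params

def speed_command(params, v_prev, div_ref=0.1, k_speed=0.5):
    """Hypothetical speed rule: hold the flow divergence at div_ref."""
    divergence = params[1] + params[5]  # equals 2 / time-to-collision (frontal case)
    # Divergence above the reference means the surface is approaching too
    # fast, so the commanded speed is reduced (never below zero).
    return max(0.0, v_prev + k_speed * (div_ref - divergence))
```

Keeping the divergence constant makes the forward speed decrease in proportion to the remaining distance, which gives the gradual slow-down described above.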



Results 



Figure 5 describes a typical ego-docking experiment,
showing the robot trajectory (recovered from odometry)
during the maneuver. Initially there is a misalignment of
about 45° between the robot heading and the direction
perpendicular to the docking surface. During the
maneuver the robot describes a smooth trajectory and
aligns the camera axis with the direction normal to the
surface, while controlling the forward speed. Figure 5
also shows the time evolution of the robot heading
direction during the maneuver.

Fig. 5.: Robot trajectory during a real ego-docking maneuver
and time evolution of the robot heading direction, in degrees.

In the eco-docking, the system has also shown
robust behavior, and we have made several tests using
different initial positions for the robot. We have used
the same controller apart from a sign inversion in the 
rotation control law. Figure 6 shows the robot trajectory 
during a typical eco-docking experiment together with 
the evolution in time of the robot speed. 














Fig. 6.: Robot trajectory during an eco-docking maneuver
and time evolution of the robot forward speed in [mm/s].

While the robot is far from the docking station, the
speed control loop increases the robot velocity up to a
cruise speed. When the robot gets closer, the speed
decreases.

The angular and position errors in the maneuvers are
in the range of a few degrees (typically up to 5°) in
orientation and a few centimeters (typically up to 5 cm)
in the distance to the docking surface. These errors are
mainly due to the low resolution of the images we use, to
the relatively low sampling frequency, and to mechanical
problems in the platform when the speed is very small.






IV. BINOCULAR TRACKING 

Many robot and computer vision systems can improve
their performance by tracking objects in the visual field.
The tracking system presented in this paper is
implemented on an active vision stereo head, and follows
some aspects motivated by the structure and functionality
of biological visual systems, which show several
advantages over other more straightforward approaches. In
particular, the use of binocularity, individual ocular
movements, a space-variant image representation, and
visuo-motor strategies based on low-level visual cues in a
closed loop control architecture plays a decisive role in
the performance of the tracking system.

Binocular tracking systems have the ability to per- 
ceive target depth, which can be an important cue 
in many robotic tasks. Additionally, binocular fusion 
greatly simplifies figure-ground segmentation, which is
a crucial step for most applications. 

When using binocular active vision systems one has to
address the problem of dealing with redundant perceptual
information and motor degrees of freedom. We define
two basic visuo-motor behaviors, Vergence and Pursuit,
inspired by the most influential ocular movements in
biological systems (vergence, saccadic and smooth
pursuit). Each behavior only extracts the relevant
information and controls the motions needed for its particular
purpose. The vergence behavior controls the depth of 
the fixation point along the gaze direction while pursuit 
behavior controls the observation direction. Depth per- 
ception is attained through the extraction of disparity 
measures and directional information is obtained by a 
combination of target position and velocity in the image 
planes. 

All perceptual measurements are made on space-variant
resolution images. We use the log-polar mapping
[15], which provides a geometry similar to the distribu- 
tion of photo-receptors in the human retina, resulting 
in both perceptual and algorithmic advantages over the 
usual cartesian representation. Additionally, image size 
is reduced and the processing effort is concentrated in 
the center of the images resulting in faster algorithms 
and an implicit focus of attention in the center of the 
visual field. This last aspect is very important for track- 
ing purposes because having higher resolution in the 
center of the images, where the target is expected to be, 
the areas belonging to the target are dominant in the 
performance of the algorithms, reducing the distracting 
influence of background elements in the periphery [5]. 
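A minimal sketch of a log-polar sampling grid is given below to illustrate the space-variant representation. The number of rings and wedges, the minimum radius and the nearest-neighbour sampling are assumptions made for illustration; they are not the sensor parameters of [15] nor those used on the head.

```python
import numpy as np

def log_polar_sample(image, n_rings=32, n_wedges=64, r_min=1.0):
    """Sample a gray image on a log-polar grid centred on the image centre.

    Returns an (n_rings, n_wedges) array; rows index log-radius, columns angle.
    """
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cx, cy)
    # Ring radii grow geometrically: dense near the centre, sparse outside.
    radii = r_min * (r_max / r_min) ** (np.arange(n_rings) / (n_rings - 1))
    angles = 2.0 * np.pi * np.arange(n_wedges) / n_wedges
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    # Nearest-neighbour lookup of each log-polar cell in the Cartesian image.
    rows = np.clip(np.rint(cy + rr * np.sin(aa)).astype(int), 0, h - 1)
    cols = np.clip(np.rint(cx + rr * np.cos(aa)).astype(int), 0, w - 1)
    return image[rows, cols]
```

With these hypothetical settings a 128x128 image is reduced to 32x64 samples, most of them concentrated near the image centre, which illustrates both the data reduction and the implicit focus of attention.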

One of the main concerns of our approach is
robustness and real-time functionality. The use of
very fast perceptual techniques, like correlation for
disparity estimation and image flow for retinal target slip,
integrated in a closed loop fashion, allows real-time and
reliable performance despite the low precision of the
algorithms and system calibration errors. Moreover, the
proposed visuo-motor strategies do not assume any
previous knowledge about target shape or motion, which
makes the system highly general.
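Correlation-based disparity estimation can be sketched as a search over a small set of candidate shifts. The sketch below is a simplification under assumptions of ours: it works on Cartesian images (the system described here uses log-polar images), it shifts the whole image with wrap-around, and the function name and search range are hypothetical.

```python
import numpy as np

def disparity_by_correlation(left, right, max_shift=8):
    """Return the horizontal shift of `right` that best matches `left`.

    Uses zero-mean normalized correlation over integer shifts in
    [-max_shift, max_shift]; np.roll wraps around at the image borders.
    """
    best_shift, best_score = 0, -np.inf
    a = left - left.mean()
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(right, s, axis=1)
        b = shifted - shifted.mean()
        denom = np.sqrt((a * a).sum() * (b * b).sum())
        score = (a * b).sum() / denom if denom > 0 else -np.inf
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift
```

In a closed loop, only the sign and rough size of the estimated shift are needed to drive the vergence angle, which is why a coarse and fast estimate of this kind can be sufficient.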

The definition of several behaviors, competing or
cooperating in the achievement of the same goal, usually
brings coordination problems. In the present case, as
vergence and pursuit behaviors are decoupled in both
perceptual and motor aspects (they acquire different
stimuli and control different motions), they run in
parallel with low internal dependency. However,
when the vergence behavior can no longer guarantee
correct binocular fusion on the target, motion
estimation can become unreliable; in such conditions, the
vergence behavior inhibits the pursuit behavior. Despite
this low internal dependency, vergence and pursuit are
highly coupled in an external sense. First, pursuit brings
the target to the central area of the images,
enhancing the performance of the vergence behavior. Second,
vergence provides binocular fusion on the target,
enabling good figure-ground segmentation and motion
estimation, required for proper pursuit. In this sense, the
tracking problem is an example of how separable
behaviors, tuned to different goals, cooperate towards a
common purpose.

Results 

The tracking system is implemented on the stereo head
Medusa [14] and runs at 12.5 Hz (half video rate)
without any specific processing hardware. All processing
routines are implemented on a Pentium-200 computer,
equipped with a PCI frame grabber, for image
acquisition, and control boards, for motor control. Figure 7
shows a sequence of images obtained during a tracking 
experiment. In the beginning, fixation is stable in the 
door at the back of the room (images 1 and 2). Once 
the target (hand) gets close to the gaze direction, the 
system starts verging and tracking the target. Notice 
that the hand is always kept very close to the center of 
the images despite background motion, target rotation 
and scaling. 



Fig. 7.: Hand tracking.



V. EGOMOTION ESTIMATION 

In this section, we address the problem of egomotion
estimation for a monocular moving observer. The problem
consists in determining the 3D motion parameters
by observing an image sequence over time. This is a
real need for many robotic applications where an
autonomous system must be able to estimate and/or
control its motion parameters before any higher level tasks
can be addressed.

The first step to estimate egomotion is the
computation of displacement between consecutive frames. Due to
the well-known aperture problem, the only
image flow component that can be estimated based on
local measurements is the normal flow.

The approach we follow is related to previous work
[7, 16, 18], and the method consists in searching the
image for particular geometric properties of the normal
flow, tightly connected to the egomotion parameters.
Hence, rather than considering the whole set of image
flow data, we use only the image sites that have special
geometric properties, which convey relevant information
about the observer motion. We split the search domain
into several geometric figures, and estimate sets of motion
parameters for each one of them.

At each image point, the normal flow provides a single
constraint on the unknown components of the flow and
depends non-linearly on the translation and linearly on
the camera rotation. Instead of using the flow all over
the image, we select only the special image sites where
the normal flow vectors do not depend on translation:
they correspond to the set of normal flow vectors that
are perpendicular to the lines radiating from the FOE
(Focus of Expansion, the projection of the camera
translation vector in the image plane). The idea is to search
for these special vectors to recover the FOE and the
rotation (i.e. they depend linearly on the rotation). In
general, the corresponding estimation methods involve
computationally demanding search algorithms [18]. We
develop a low complexity estimator by subdividing the
search domain, according to some geometric constraints of
the normal flow.
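The property exploited above can be checked numerically. The sketch assumes the usual pinhole model, in which the translational flow at an image point is directed along the line joining the FOE to that point and is scaled by the ratio W/Z of forward speed to depth; the function names are illustrative.

```python
import numpy as np

def translational_flow(pt, foe, w_over_z):
    """Translational image flow at `pt`: radiates from the FOE, scaled by W/Z."""
    return w_over_z * (np.asarray(pt, dtype=float) - np.asarray(foe, dtype=float))

def normal_component(flow, gradient_dir):
    """Projection of a flow vector on the (normalized) image gradient direction."""
    n = np.asarray(gradient_dir, dtype=float)
    n = n / np.linalg.norm(n)
    return float(n @ flow)
```

For a point at (3, 4) and an FOE at (1, 1), the radiating direction is (2, 3); a normal flow vector along (-3, 2) is perpendicular to it and therefore has no translational contribution, whatever the depth. Only the rotational part remains at such sites.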

We use two types of normal flow vectors [16]: the
radial normal flow (the set of normal flow vectors with
radial directions) and the circular normal flow (the set
of normal flow vectors with a direction perpendicular to
radial lines). The translational component of the radial
normal flow vanishes on the τ-circle having the image
center and the FOE as diametrically opposite points. The
translational component of the circular normal flow is
zero on the φ-line going through the FOE and the image
center. On the other hand, the rotational component of
the radial and circular normal flow is, respectively,
sinusoidal in φ and affine in r (where r and φ are the polar
coordinates of the image plane).

Two search algorithms can be defined based on these
properties: (1) The φ-line algorithm searches for a
radial line passing through the image origin, such that
the circular normal flow on this line is affine in r. Once
the φ-line has been determined, this algorithm recovers
uniquely two constraints on the rotation and the
direction of the FOE. (2) The τ-circle algorithm consists
in searching for a circle (having the origin and a point of
the estimated φ-line as opposite points) such that the
radial normal flow is sinusoidal in φ. This algorithm solves
for the remaining egomotion parameters, namely the FOE
and the individual rotation values.
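The first search can be illustrated with a brute-force sketch: for each candidate angle, an affine function of r is fitted to the circular normal flow sampled along that radial line, and the angle with the smallest fitting residual is kept. This is a simplified illustration under assumptions of ours (dense samples on a regular polar grid, plain least squares instead of a robust estimator, hypothetical function name); it is not the estimator detailed in [17, 18].

```python
import numpy as np

def phi_line_search(circ_flow, radii, angles):
    """Return the candidate angle whose radial line best fits 'affine in r'.

    circ_flow[i, j]: circular normal flow at radius radii[i], angle angles[j].
    """
    A = np.column_stack([np.ones_like(radii), radii])  # affine model in r
    residuals = []
    for j in range(len(angles)):
        coef, *_ = np.linalg.lstsq(A, circ_flow[:, j], rcond=None)
        residuals.append(np.sum((circ_flow[:, j] - A @ coef) ** 2))
    # On the line through the FOE the depth-dependent translational term
    # vanishes, so only the affine rotational term remains.
    return angles[int(np.argmin(residuals))]
```

Away from the line through the FOE, the translational term varies with the unknown depth of each point, so the affine fit leaves a large residual there.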

The sequential method described here can be
generalized to other search algorithms applied to different
types of normal flow vectors [17].

These algorithms were tested in a series of
experiments, using both synthetic and real image data. In
Sequences 1 and 3, the camera rotates around the vertical
axis, while translating. In Sequence 2, the camera
undergoes a pure translational motion in a real cluttered
scene. We have applied the φ-line and τ-circle
algorithms to all image sequences. The φ-lines and τ-circles
found are shown in Figure 8, and intersect close to the
true location of the FOE. We used a known robust
estimator to estimate the rotational values from the
corresponding observations: the least median of squares
estimator [9]. The estimator is designed for a simple
bidimensional estimation problem. See [18, 17] for more
details about this estimation issue, namely a description of
the rotational estimates (that are very close to the true
values).

Fig. 8.: Results of the φ-line (top) and τ-circle (bottom)
algorithms using sequences 1, 2, 3.

In summary, our approach depends solely on
spatio-temporal image derivatives, and is based on the
subdivision of the search domain in various subspaces, which
depend on specific geometric constraints of the normal
flow field. To decrease the sensitivity to measurement
noise we apply robust estimators on bidimensional
estimation problems. Robustness can be improved if more
subspaces are explored, but this choice depends on the
[...] accuracy.
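A least median of squares estimator for a two-parameter problem can be sketched with random two-point sampling. The line model y = a x + b stands in for the bidimensional estimation problems mentioned above; the function name, the number of trials and the sampling scheme are assumptions of ours rather than the procedure of [9].

```python
import numpy as np

def lmeds_line(x, y, n_trials=200, seed=0):
    """Least-median-of-squares fit of y = a*x + b by random 2-point sampling."""
    rng = np.random.default_rng(seed)
    best, best_med = (0.0, 0.0), np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue  # degenerate pair: slope undefined
        a = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - a * x[i]
        # Score each candidate by the median squared residual, which ignores
        # up to half of the data and hence tolerates many outliers.
        med = np.median((y - (a * x + b)) ** 2)
        if med < best_med:
            best, best_med = (a, b), med
    return best
```

Unlike ordinary least squares, the fit is unaffected by a minority of gross outliers, which is the property that makes this kind of estimator attractive for noisy normal flow data.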



VI. CONCLUSIONS 






We have presented various visual behaviors for mobile
robots, solving a set of relevant problems for autonomous
systems. Navigation behaviors are used to control a
robot when crossing a corridor-like environment,
following walls, avoiding obstacles or approaching a given
point in the scene with controlled orientation and speed.
Vergence and pursuit behaviors are used to track
moving objects in the scene, and egomotion behaviors
permit the recovery of the vehicle motion parameters.

Apart from some specific differences, the approach
adopted is based on a number of common principles.



One of the main aspects is the purposive definition 
of the sensory apparatus both at the geometric level 
(camera location and image geometry) and processing 
level (how to adjust the visual processing for each of 
the perceptual tasks). Even though the approach cannot 
be considered general, it solves with limited complexity 
relevant problems for an autonomous system. 

Another issue worth mentioning is that qualitative 
and direct visual measures are used to achieve a rea- 
sonable autonomy with limited computational power. 
The approach is based on the continuous use of vi- 
sual measures, providing a continuous stream of envi- 
ronmental information. Hence, these behaviors illustrate 
the possibility of implementing sensory-motor strategies 
where the need for a continuous motor control is not 
bounded by an "intermittent" flow of sensory informa- 
tion. 

Two factors characterize the different behaviors: the
part of the visual field on which attention is focused,
and the fact that the control law adopted reflects the
direct link between visual information and vehicle motion.
Such attentional mechanisms are used explicitly
in the binocular tracking behavior, where a foveated
sensor is used for the image representation.

The fact that the "appropriate action" is totally
embedded inside each single behavior is a very powerful
way of breaking a complex problem into simpler,
tractable ones. The complexity of the perceptual
processes is tuned to each specific behavior, and the need
for "general purpose" perceptual processes becomes less
crucial. We strongly believe that, in the long run, the
economic advantage of this approach will become evident
for artificial systems, as it is already evident for
natural living systems, and thus contribute to the design
of truly autonomous systems.

VII. ACKNOWLEDGMENT 

This work was partially supported by projects
PRAXIS/3/3.1/TPR/23/94, JNICT-PBIC/TPR/2550/95 and
EC ESPRIT/LTR NARVAL.

VIII. REFERENCES 

[1] J. Aloimonos. Purposive and qualitative active
vision. In Proc. ECCV90 - European Conference on
Computer Vision, Antibes, France, April 1990.

[2] Y. Aloimonos, I. Weiss, and A. Bandyopadhyay.
Active vision. Int. Journal of Computer Vision,
1(4):333-356, January 1988.

[3] R. Bajcsy and C. Tsikos. Perception via manipula- 
tion. In Proc. of the Int. Symp. & Exposition on 
Robots, pages 237-244, Sydney, Australia, Novem- 
ber 6-10 1988. 

[4] D.H. Ballard. Animate vision. Artificial
Intelligence, 48:57-86, 1991.

[5] A. Bernardino and J. Santos-Victor. Vergence
control for robotic heads using log-polar images. In
Proc. of the 1996 IEEE/RSJ International Conference
on Intelligent Robots and Systems, pages 1264-1271,
Osaka, Japan, November 1996. IEEE Computer
Society Press.

[6] C. Fermüller. Navigational preliminaries. In
Y. Aloimonos, editor, Active Perception. Lawrence
Erlbaum Associates, 1993.

[7] C. Fermüller. Qualitative egomotion. IJCV,
15(1/2):7-29, June 1995.

[8] H. Mallot, H. Bülthoff, J. Little, and S. Bohrer.
Inverse perspective mapping simplifies optical flow
computation and obstacle detection. Biological
Cybernetics, 64:177-185, 1991.

[9] P.J. Rousseeuw and A.M. Leroy. Robust Regression 
& Outlier Detection. John Wiley & Sons, Inc, 1987. 

[10] G. Sandini, F. Gandolfo, E. Grosso, and
M. Tistarelli. Vision during action. In Y. Aloimonos,
editor, Active Perception. Lawrence Erlbaum
Associates, 1993.

[11] J. Santos-Victor, G. Sandini, F. Curotto, and
S. Garibaldi. Divergent stereo in autonomous
navigation: From bees to robots. Int. Journal of
Computer Vision, Special Issue on Qualitative
Vision, Y. Aloimonos (Ed.), 14(2):159-178, 1995.

[12] J. Santos-Victor and G. Sandini. Uncalibrated ob- 
stacle detection using normal flow. Machine Vision 
and Applications, 9(3):130-137, 1996. 

[13] J. Santos-Victor and G. Sandini. Visual behaviors
for docking. Computer Vision and Image
Understanding (to appear), 1997.

[14] J. Santos-Victor, F. van Trigt, and J. Sentieiro.
Medusa - a stereo head for active vision. In Proc. of
the Int. Symposium on Intelligent Robotic Systems,
Grenoble, France, July 1994.

[15] E. Schwartz. Spatial mapping in the primate
sensory projection: Analytic structure and relevance
to perception. Biological Cybernetics, 25:181-194,
1977.

[16] C. Silva and J. Santos-Victor. Direct egomotion
estimation. In Proc. of the 13th Int. Conference on
Pattern Recognition, Vienna, Austria, August 1996.

[17] C. Silva and J. Santos-Victor. Robust egomotion
estimation from the normal flow using search
subspaces. Technical Report 6/96, ISR/Inst. Sup.
Tecnico - VisLab, 1996.

[18] C. Silva and J. Santos-Victor. Robust egomotion
estimation from the normal flow using search
subspaces. Accepted for publication by PAMI, 1997.

[19] M.V. Srinivasan, M. Lehrer, W.H. Kirchner, and
S.W. Zhang. Range perception through apparent
image speed in freely flying honeybees. Visual
Neuroscience, 6:519-535, 1991.






